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Abstract 

Background: A protein binding hot spot is a cluster of residues in the interface that are energetically important 
for the binding of the protein with its interaction partner. Identifying protein binding hot spots can give useful 
information to protein engineering and drug design, and can also deepen our understanding of protein-protein 
interaction. These residues are usually buried inside the interface with very low solvent accessible surface area 
(SASA). Thus SASA is widely used as an outstanding feature in hot spot prediction by many computational 
methods. However, SASA is not capable of distinguishing slightly buried residues, of which most are non hot spots, 
and deeply buried ones that are usually inside a hot spot. 

Results: We propose a new descriptor called "burial level" for characterizing residues, atoms and atomic contacts. 
Specifically, burial level captures the depth the residues are buried. We identify different kinds of deeply buried 
atomic contacts (DBAC) at different burial levels that are directly broken in alanine substitution. We use their 
numbers as input for SVM to classify between hot spot or non hot spot residues. We achieve F measure of 0.6237 
under the leave-one-out cross-validation on a data set containing 258 mutations. This performance is better than 
other computational methods. 

Conclusions: Our results show that hot spot residues tend to be deeply buried in the interface, not just having a 
low SASA value. This indicates that a high burial level is not only a necessary but also a more sufficient condition 
than a low SASA for a residue to be a hot spot residue. We find that those deeply buried atoms become 
increasingly more important when their burial levels rise up. This work also confirms the contribution of deeply 
buried interfacial atomic contacts to the energy of protein binding hot spot. 




Background 

Protein-protein interactions are dominated by hydrogen 
bonds, salt bridges and hydrophobic contacts across the 
interface [1,2]. These local interactions have to be desol- 
vated, densely packed, and hence deeply buried to make 
contribution to the binding free energy [3-6]. This is 
why the energetically important hot spot residues in the 
interface tend to cluster into local regions with low sol- 
vent accessible surface area (SASA) values [7,8]. 
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Identifying these energetically important residues, 
which can offer useful information to protein engineer- 
ing and better understanding of protein-protein interac- 
tion [9], is usually done by site-directed alanine 
mutagenesis. This experimental method mutates the tar- 
get residue into alanine which only has a C'^ heavy atom 
on its side-chain [10,11]. A residue whose mutation 
results in a large binding free energy change (>2.0 kcal/ 
mol, for example) is defined as a hot spot residue [12]. 

Many feature-based [13-17] energy-based [18-23] and 
even feature-and-energy-combined [24,25] computa- 
tional approaches have been proposed to address the 
hot spot prediction problem. Almost all of these 
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feature-based methods use SASA information of the 
residue as a critical feature in the prediction. A low 
SASA is necessary for a residue to be a hot spot residue; 
however, it is not sufficient, as a large number of non 
hot spot residues also have low SASA values. Therefore, 
SASA is not effective for distinguishing between slightly 
buried residues — a large part of which are non hot spot 
residues — and deeply buried residues that are very likely 
to be hot spot residues. 

In this work, we introduce a new descriptor for pro- 
tein atoms and residues. It is named "burial level". In 
the definition of burial level, the buried immobilized 
water molecules are treated as an integral part of the 
protein complex. We show that our definition of residue 
burial level is nicely correlated to AAG. A high burial 
level is not only in general necessary for hot spot resi- 
dues but also more sufficient for them in comparison to 
SASA. In other words, most hot spot residues tend to 
have high burial level while most non hot spot residues 
are exposed or just slightly buried. We also define the 
burial level of atomic contacts and we determine the 
number of three types of buried interfacial atomic con- 
tacts at different burial level that are directly broken 
when the residue is substituted by alanine. The number 
of those deeply buried atomic contacts together with the 
burial level of the residue itself are further fed into SVM 
as features to classify interfacial residues into hot spot 
residues or non hot spot residues. We name this SVM- 
based model DBAC since the features used are mainly 
based on the Deeply Buried Atomic Contacts. By apply- 
ing our method to a data set of 258 mutations, we 
achieve an F measure of 0.6237 under the hot spot defi- 
nition of AAG > 2.0 kcal/mol, which is better than other 
computational methods. We also conduct a detailed 
analysis of the features used in this work; and we find 
that hot spot residues tend to have significantly more 
deeply buried atomic contacts than non hot spot 
residues. 

Methods 

Data set 

Our data set is collected by retrieving the experimental 
alanine mutagenesis data from the alanine scanning 
energetics database (ASEdb) [26] and other previously 
published works [27-31]. We require that: the 3D struc- 
ture of the wild-type protein complex is solved by X-ray 
crystallography and is reported in PDB [32], and the 
associated solvent information is also included in the 
PDB file. We do not consider protein-ligand interaction 
or protein-peptide interaction in this work; thus those 
interactions without an extended interface are excluded. 
The reason is that the interfaces of protein-ligand inter- 
actions are small and most interfacial residues are 
exposed in the solvent to a certain degree; thus the 



burial levels of the atoms are mostly very low and imply 
little information. The structural similarity of the com- 
plexes are tested by the CE algorithm [33]. If the two 
chains of the two complexes have a significant similarity, 
their binding interfaces are further examined to ensure 
that there is no redundancy in the data set. Further- 
more, only mutations in the interface are considered. 
We used another version of this data set in our previous 
work [34], where the requirement that the mutations 
have to be in the interface was not applied. 

Our data set in this work consists of 258 mutations 
distributed in 13 protein complexes. Hot spot residues 
are usually defined by setting AAG > 1.0 kcal/mol or 
AAG > 2.0 kcal/mol. We prefer the second choice, as 
only a higher AAG threshold can reflect the direct influ- 
ence of the mutation. That is, the interfacial atomic con- 
tacts that are directly broken by the mutation are taken 
into consideration with more weights. Under the AAG > 
2.0 threshold, there are 50 hot spot residues and 208 
non hot spot residues in our data set. Some researchers 
even suggested that a residue should have a AAG higher 
than 4.0 kcal/mol so as to have a strong impact on the 
binding of the two proteins [9]. In practice, a lower 
value has to be considered in order to get enough data 
for statistical analysis [9]. 

The data set is available at http://155.69.2.25:8080/ 
DBAC data. 

Feature generation 

Burial level for an atom, residue or an atomic contact 

Our definition of burial level is based on atomic contact 
graph. The atomic contact graph of a protein complex 
is an undirected graph with heavy atoms as nodes and 
atomic contacts as edges. The atoms in this graph are 
labeled as exposed or buried according to its SASA. If 
the SASA of an atom is not less than 10. OA ^, it is 
exposed, otherwise it is buried. SASA is calculated by 
the NACCESS software based on the Lee-Richards algo- 
rithm [35]. All the exposed water molecules, which we 
consider as a part of the bulk solvent, are removed, 
while the buried water molecules are kept as a part of 
the complex. Thus the oxygen atoms of the buried 
water molecules are a part of the atomic contact graph. 

The atomic contact is defined by a distance threshold 
and the Voronoi diagram. Voronoi diagram decomposes 
the 3D space into cells by assigning every point in the 
space to its nearest neighboring input site. Here in this 
work, the input sites are the positions of the atoms in 
the complex structure. If two atoms' Voronoi cells are 
adjacent to each other, they are somehow "sheltering" 
each other. We define the atomic contact by adding an 
extra distance requirement to Voronoi diagram. Two 
atoms are considered to be in contact if they have a dis- 
tance less than their Van der Waals radius plus the 
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diameter of a water molecule (2.75 A) and they share a 
Voronoi facet. We actually used the Delaunay diagram 
that is dual to Voronoi diagram. Two facet-sharing Vor- 
onoi cells correspond to two connected points in the 
Delaunay diagram. The Delaunay diagram is calculated 
by using the qdelaunay program in qhull [36]. This dis- 
tance threshold, 2.75 A, is based on a water-free idea 
and it has been used in [37]. 

In an atomic contact graph, the burial level of an atom 
is defined as the length of the shortest path from this 
atom to its nearest exposed atom. For example, the bur- 
ial level of exposed atoms is 0 and the burial level of 
their immediate buried neighbors is 1. We calculate the 
burial levels by adding a pseudo node, which represents 
bulk solvent, to the atomic contact graph. This node is 
connected to all of the exposed nodes directly. Then the 
burial level of any atom equals to the length of the 
shortest path from this atom to the pseudo node minus 
1. This is exactly the single-source-shortest-path pro- 
blem and it can be solved using Dijkstra's algorithm 
[38]. 

The reason for using a Voronoi-diagram-combined 
definition of atomic contact is as follows. If only dis- 
tance information is used, there will be many false 
atomic contacts in the atomic contact graph whose two 
atoms cannot contact with each other at all (as they do 
not share Voronoi facet thus they are spatially blocked 
by other atoms), and the atomic contact graph will be a 
trivial discretization of the Euclidean distance between 
atoms, and the atom burial level will only depend on 
the distance of the atom to the surface of the complex, 
especially when a large distance threshold (2.75 A) is 
used. Adding Voronoi diagram to the definition makes 
the burial level depend also on the organization of the 
atoms inside. Intuitively, the burial level of atoms in a 
protein complex depends on the size of the protein 
complex. In general, the larger the protein complex is, 
the more deeply buried atoms there are. Burial level also 
depends on the shape of the interacting proteins. For 
example, globular proteins and protein complexes gen- 
erally have more deeply buried atomic contacts than 
those with other shapes. 

Note that the calculation of burial level requires infor- 
mation on buried water molecules. In our previous work 
[34], we have systematically analyzed the contribution of 
water molecules to the calculation of burial level as well 
as to protein binding hot spots. 

Figure 1 shows a burial level pattern inside a growth 
hormone and growth hormone receptor complex. As 
seen from the figure, atom burial level is indeed a good 
indicator to describe the extent to which an atom is 
buried inside a protein or a protein complex. It is clear 
that the burial level of any two neighboring atoms can 
have a difference of at most 1. Because the complex is 




Figure 1 Cross-section of a growth hormone and growth 
hormone receptor complex Cross-section of a growth hormone 
and growth hormone receptor complex [PDB:1A22] showing the 
burial level pattern. A layer of atoms in the front are shown as 
spheres while those at the back are shown as sticks. The colors 
from green to red indicate the burial level of atoms from 0 to 4. 
The largest burial level is 4 here. The largest burial level of atom 
within a protein/protein complex depends on its size, shape and 
the nearby atoms' organization. 

not perfectly globular, burial level 2 is "thicker" with 
more atoms. 

The burial level of a residue is the average value of the 
burial levels of all atoms in the residue. For an atomic 
contact, if the burial levels of the two atoms are the 
same, the burial level of the atomic contact is taken as 
the burial level of the two atoms, otherwise it is defined 
as the smaller one of the two burial levels. The differ- 
ence of the burial levels of two contacting atoms is at 
most one. 

There are some existing concepts that are related to 
burial level. In [39-43], the authors defined their concept 
of depth of an atom as the Euclidean distance to the 
closest exposed atom or to the closest surface water 
molecule. There are even some sequence-based methods 
[44-46] that are capable of predicting its value. This 
definition is only based on the Euclidean distance and 
hence it cannot capture the contacts between atoms or 
the organization of the atoms. And the calculation of 
the shortest distance contains an exhaustive search 
among all exposed atoms or surface water molecules for 
every buried atom, while our graph-based concept of 
burial level can be easily calculated by transferring the 
calculation into a single-source-shortest-path problem. 
Directly broken atomic contact, atomic contact types and 
features 

When a residue is mutated into alanine, some interfacial 
atomic contact are directly broken because of the 
removal of certain atoms; Some other interfacial atomic 
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contacts may also be broken or distorted due to the 
conformational change in the local region [31]. Given a 
residue, we consider its directly broken interfacial 
atomic contacts, namely those contacts formed by an 
atom other than C, N, O, C" or C (these atoms are the 
remainder after a residue is mutated into alanine) and 
any contact partner from the other chain, even its back- 
bone atoms. For example, for a serine in a complex, the 
atomic contact between its and a C" of any residue 
from the interacting partner is a directly broken atomic 
contact of this serine, while the contact between its 
and whatever atom from the interacting partner is not. 
We classify the atomic contacts into three types. If a 
contact is between a positively charged atom and a 
negatively charged atom, which usually corresponds to a 
salt bridge, it is called a Type-I contact. If a contact is 
between a hydrogen bond donor and a hydrogen bond 
acceptor, which usually is a hydrogen bond, it is classi- 
fied as Type-II contact. Contacts that are neither Type-I 
nor Type-II are classified as Type-Ill. Here, the defini- 
tions for positively charged atoms, negatively charged 
atoms, hydrogen bond donors and hydrogen bond 
acceptors are as given in [8]. We do not further divide 
the Type-Ill contacts into subtypes such as other polar 
contact, hydrophobic contact and so on because they 
are all not as specific as Type-I and -II contacts. Note 
that the definitions for Type-I and Type-II contacts are 
not exactly the same as salt bridges and hydrogen bonds 
in terms of geometrical requirements, yet they can be 
still very important [47]. 

In this work, we use deeply buried atomic contacts 
whose burial level is not less than 2. We refer to atomic 
contacts at burial level 0 as exposed atomic contacts 
and those at burial level 1 as slightly buried atomic con- 
tacts. Let C(i, J) denote the number of Type-i directly- 
broken interfacial atomic contacts at burial level / of a 
residue. Then our model uses 6 features to describe a 
residue: C(I, > 2), C{II, 2), C{II, > 3), C{III, 2), C{III, > 
3), plus the burial level of the residue. An SVM model 
based on this feature set is named DBAC (Deeply Bur- 
ied Atomic Contacts). For comparison, we have also 
built another model named AC (Atomic Contacts) 
based on another feature set comprising C(/, 0), C(I, 1), 
C(I, > 2), C{II, 0), C{II, 1), C(II, 2), C(II, > 3), C(III, 0), C 
(///, 1), C{III, 2), C(III, > 3), and the burial level of the 
residue. The maximum value of burial level depends on 
the size of the protein complexes, the size of the inter- 
faces as well as the shape of the complex and the orga- 
nization of the atoms. In general, very few contacts have 
burial level larger than 3, so we do not distinguish 
further burial levels larger than 3. For Type-I contact, 
there are very few cases that have burial level larger 
than 2, thus we do not use C{I, > 3) as a feature but 
merge it with C(/, 2) into C(/, > 2). 



SVM training-testing protocol 

Support Vector Machines (SVMs) are widely used in 
many classification and regression problems. They have 
also been adopted in hot spots prediction problems 
[15,17,24] with various feature sets and training-testing 
protocols. In this work, we use the LIBSVM software 
[48], which is a tool for SVM model training and testing 
available at http://www.csie.ntu.edu.tw/~cjlin/libsvm. 
We use the radial basis function (RBF) as the kernel. 
We do not conduct feature selection because our 
method is straightforward, and the number of features is 
not large. However, we evaluate performance on two 
different feature sets: the deeply buried atomic contacts 
only (by DBAC) and all the atomic contacts (by AC). 
The latter feature set is evaluated just for comparison. 

The performance is evaluated under leave-one-out 
cross-validation. To avoid over fitting, we have strictly 
followed a nested-loop cross-validation procedure. 
There are 258 mutations in our data set, each time one 
mutation is taken as the test data and the remaining 
257 mutations are used to train the model. The two 
parameters, namely cost and gamma, are optimized on 
the training data by a grid search. The grid search evalu- 
ates the performance, F measure, of SVMs with different 
parameter values on the training data using 5-fold cross- 
validation, and the parameter values with the best per- 
formance are chosen to build a training model on the 
training data. This training model is then applied to the 
test data, that is, the mutation held out in advance. This 
procedure is repeated 258 times until every mutation in 
the data set is tested. 

Metrics in performance evaluation and statistical analysis 

Performance is measured by sensitivity, precision, speci- 
ficity, accuracy and F measure (Fl). These measures are 



defined asfollows: sensitivity = 



TP 



TP + FN 



precision = 



TP 



accuracy - 



TP + TN 



TP + FP + TN + FN 



accuracy - 



TP + FP 
TP + TN 
TP + FP + TN + FN ' 



P^^ lxsensitintyxpreaston ^^^^^ 
sensitivity + precision 
and FN are the number of true positives, false positives, 
true negatives and false negatives, respectively. A better 
classifier should predict hot spot residues with less false 
positives and less false negatives; thus the F measure, 
which combines sensitivity and specificity, is used to 
indicate the overall performance. 

We also test the significance of the difference in AAG 
values of predicted hot spot and non hot spot residues. 
A classifier divides the mutations in the data set into 
two groups: computational hot spot residues and com- 
putational non hot spot residues. The significance of the 
AAG value difference in these two groups are tested by 
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Mann-Whitney test [49]. The result of a classifier with 
higher Fl value can be less significant when its false 
positives have very low AAG values (near 0 kcal/mol or 
even negative) and the false negatives have high AAG 
values. 

We also examine the value distribution of individual 
features in hot spot and non hot spot residues. The sig- 
nificance of the difference in the two classes is also 
tested by Mann- Whitney test. 

Results and Discussion 

Performance of hot spot residue prediction 

As introduced, 5 new features are derived from those 
deeply buried interfacial atomic contacts which are 
directly broken by alanine substitution. The feature 
values of a residue are then fed into SVM together with 
the overall residue burial level to predict whether this 
residue is a hot spot residue or not. The performance 
under leave-one-out cross-validation is shown in the 
second row of Table 1. We achieve an F measure of 
0.6237, when AAG > 2.0 is used as the threshold to 
define hot spot residues. The precision of our method is 
higher than the recall, which means that there are fewer 
false positives than false negatives. A reason for this is 
that our model emphasizes the contribution of directly 
broken atomic contacts. The contacts that are broken or 
newly formed by the conformational change during the 
mutation are hard to define quantitatively. The AAG of 
some hot spot residues, whose mutation results in a 
large conformational change, cannot be fully explained 
by its directly broken atomic contacts. This is reflected 
in the lower sensitivity value. The non hot spot residues, 
whose AAG is low, tend to have fewer deeply buried 
directly broken atomic contacts, leading to a smaller 
number of false positives and hence a higher precision. 
We have also evaluated the performance of using the 
AC feature set containing 12 features, which take 
exposed and slightly buried atomic contacts into consid- 
eration as well. As shown in the third row of Table 1, 
the performance is not improved even though extra 
exposed and slightly buried atomic contacts are added 
to the feature set. Rather, the F measure is driven down 
to 0.4 and the statistical significance is reduced a lot as 
well. The reason is presented later. 

Table 1 Performance of our method (DBAC) in 
comparison with using all atomic contact (AC) and 
Robetta 



Method Sensitivity Precision Specificity Accuracy Fl p-value 



DBAC 


0.58 


0.6744 


0.9327 


0.8643 


0.6237 


3.0280X- 


10" 


AC 


0.32 


0.5333 


0.9327 


0.8140 


04 


1.2849X- 


10" 


Robetta 


0.44 


0.3667 


0.8173 


0.7442 


04 


5.381 7x' 


10" 


FoldX 


0.5 


0.4630 


0.8606 


0.7907 


0.4808 


6.2451 X' 


10" 



The reason we use leave-one-out cross-validation is 
that we have a small data set and, moreover, there are 
only a small number of positive samples (hot spot resi- 
dues). To test the robustness of our method, we evalu- 
ate performance using leave-«-out cross-validation 
under the same training-testing procedure. We find that 
when « is not large (< 7), the performance change is not 
significant, sometimes better (Fl=0.6304, «=5) and 
sometimes worse (Fl=0.5934. «=3) than that by using 
leave-one-out. Anyway, as shown later, no matter how 
large « is, our performance is always better than Robetta 
and FoldX. For example, the performance of our 
method by 5-fold (leave-51-out) cross-validation is 
0.5870 in F measure. This indicates that our method is 
robust. 

We compare our method with three energy-based 
methods, Robetta [19,21], FoldX [20,50] and EGAD [22]. 
Robetta is an online service. It can be used to predict 
the AAG value of interfacial residues by computational 
alanine scanning based on an energetic function. It can 
thus be applied to hot spot prediction. Actually, it is a 
widely recognized gold standard for benchmark compar- 
ison in the field. Its performance on our data set is 
shown in the forth row of Table 1. Our performance is 
remarkably better than that of Robetta in terms of both 
Fl and p-value. 

FoldX is also available online. It is able to predict the 
change in both protein stability and affinity. Its energy 
function contains different sources of contributions such 
as van der Waals interactions, hydrogen bonds and even 
water-bridged interactions. We calculated the AAG of 
mutations in our data set by using FoldX version 3.0 
beta 4 with default parameters. From the fifth row of 
Table 1, although FoldX shows a better performance 
than Robetta, probably due to the fact that it is being 
updated, our method still achieves a better performance 
than that by FoldX. 

EDAG contains a group of energy functions for pro- 
tein design and it has a module to predict the change in 
binding free energy after a mutation. We compare our 
method with EGAD based on the data that is common 
between our data set and those reported in their paper 
[22], namely 166 mutations (34 hot spot residues and 
132 non hot spot residues). As shown in Table 2, our 
method's performance is significantly better than that of 
EGAD. In fact, EGAD tends to underestimate the AAG 
values — especially for the barnase-barstar complex 
[PDB:1BRS] in which there are many hot spot residues 

Table 2 Comparison of our method with EGAD 

Method Sensitivity Precision Specificity Accuracy Fl p-value 

DBAC 0.5294 0.6923 0.9394 0.8554 0.6 3.1412x10"'' 

EGAD 0.3235 0.5 0.9167 0.7952 0.3929 1.3693x10"* 
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— thus many hot spot residues are misclassified as non 
hot spot, resulting in poor sensitivity. 

These energy-based method are complicated and 
time-consuming. The energy functions usually contain 
many terms that represent different kinds of energies. 
Both binding and folding of proteins can affect the bind- 
ing free energy between two proteins. But binding and 
folding are very complicated processes whose details are 
difficult to capture. When a residue is mutated into ala- 
nine, the new structure of the mutated protein and the 
mutated protein complex must be predicted to get the 
values of all energy terms of the mutated structure, 
which is also very difficult. Thus the AAG are hard to 
be accurately estimated even by these complicated 
energy functions. From Tables 1 and 2, the performance 
of these energy-based methods are not very good yet. 

We also compare our method with another machine 
learning method, MINERVA [15], which uses SVM as 
well and is based on a larger feature set containing var- 
ious aspects of information of target residue such as 
weighted atomic packing density, relative surface area 
burial, weighted hydrophobicity and so on. 

MINERVA has good performance in terms of Fl value 
in comparison with other previous machine-learning 
methods. Because its source codes and software are not 
available, we compare the performance on data that are 
common between the MINERVA paper and our data 
set. This common data set contains 178 mutations with 
36 hot spot residues and 142 non hot spot residues. It 
can be seen from Table 3 that MINERVA does not out- 
perform our method in terms of F measure or p-value. 
Moreover, the reported performance of MINERVA may 
be biased upwards by an overfitting effect because 
MINERVA uses 12 features selected from a total of 54 
basing on the whole data set. MINERVA had also been 
tested on an independent data set derived from BID 
[51], where the importance of a residue is labeled as 
"strong", "intermediate", "weak" or "insignificant". In 
that test, MINERVA treated a residue as a hot spot resi- 
due only when its label is "strong". We also test our 
model (trained on our data set, 258 mutations) on a 
subset of this independent data set containing 111 
mutations whose PDB files have associated solvent 
information. The performance of our model in terms of 
Fl on this data set is 52%, which is same as that of 
MINERVA. Note that the label of a residue is not per- 
fectly correlated to its AAG, which is an inconsistency 
between the training data and the test data; however, it 

Table 3 Comparison of our method with MINERVA 

Method Sensitivity Precision Specificity Accuracy Fl p-value 
DBAC 0.5833 0.7 0.9366 0.8652 0.6364 2.4011x10"'' 

MINERVA 0.5 0.6667 0.9366 0.8483 0.571 1.3731x10"' 



Table 4 Statistical analysis on the features. 



Feature 


RBL* 


C(l, 0) 


C(l, 1) 




C(l, > 2) 


p-value 


5.0897 X ia'° 


0.2008 


0.8204 




0.0050 


Feature 


C(ll, 0) 


C(ll, 1) 


a//, 2} 




C(ll, > 3) 


p-value 


0.0013 


04133 


12419 X 


la" 


3.5031 X la'' 


Feature 


C(lll, 0) 


C(lll,1) 


C(ll!, 2) 


10-'" 


QUI, > 3) 


p-value 


0.0034 


0.0945 


1.9061 X 


1.1621 X 10''-' 



The features in the feature set DBAC are emphasized in italics 



*:residue burial level 

still can indicate the contribution of a residue to 
binding. 

Feature analysis 

We test the significance of the difference of the values 
of a feature in the hot spot and non hot spot residues. 
The p-values are reported in Table 4. It is clear that the 
DBAC features have very low p-values, indicating that 
their values are significantly different between the two 
classes. The p-value of C(/,> 2) is not as low as that of 
other DBAC features, because there are fewer residues 
that have salt bridges. The numbers of slightly buried 
atomic contacts are not as significant as those of the 
deeply buried ones. Thus our feature set can indeed 
reflect the contrast between hot spot residues and non 
hot spot residues, and the idea of excluding slightly bur- 
ied and exposed atomic contacts and using only deeply 
buried atomic contacts is statistically reasonable. 

We show in Table 5 the performance of our method 
after one feature is removed under the same training- 
testing protocol described in Methods. It seems that the 
removal of residue burial level, C{I, > 2), C(II, 2) or C(II, 
> 3) has little impact on performance. A reason is that 
the features we are using are somehow correlated with 
each other. In general, when an interfacial residue is 
deeply buried with high residue burial level, it has sev- 
eral deeply buried atomic contacts with the other side. 
On the other hand, the removal of C{III, 2) or C(III, > 
3) reduces performance a lot. The reason is that there 
are often many Type-Ill atomic contacts in hot spot 
residues, as Type-Ill atomic contacts is not as specific as 
Types I or II. This suggests that we can further divide 
Type-Ill atomic contacts into subtypes. We have actu- 
ally tried dividing Type-Ill contacts into other polar- 
polar contacts and hydrophobic contacts; and it turns 
out that the performance change is not much and 
hydrophobic contacts become the new dominant one. 
However, this does not mean residue burial level, C(I, > 

Table 5 Performances of our method after one feature is 
removed 

Feature Removed RBL C(l, > 2) C(ll, 2) C(ll, > 3) C(lll, 2) C(lll, > 3) 
Fl 0.6 0.6067 0.6237 0.6 0.4828 0.3947 
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2), C{II, 2) and C(II, > 3) do not contribute to the per- 
formance because all the six features have significantly 
different values in hot spot and non hot spot residues. 
In fact, we can achieve an F measure of 0.5 with only 
the residue burial level. 

Another interesting observation from Table 5 is that 
the performance after the removal of C(II, 2) (or C{III, 
2) ) is better than that after the removal of C{II, > 3) (or 
C(III, > 3)), although the difference between hot spot 
and non hot spot residues in C(II, 2) or C{III, 2) is more 
significant (as shown in Table 4) and the number of 
residues that have Type-II or Type-Ill atomic contacts 
at burial level > 3 is lower (as shown in Figure 3). As 
we have already shown that the performance with fea- 
ture set AC is worse than that with DBAC, the observa- 
tion here further indicates that the atomic contact at 
deeper burial level (> 3) is more important in hot spot 
prediction. This again confirms that burial level plays a 
very important role. 
Residue burial level 

Residue burial level is a very important feature for pre- 
dicting hot spot residues. Its p-value shows a very signifi- 
cant difference between hot spot residues and non hot 
spot residues, as can be seen in Table 4. Here, we explain 
that residue burial level is more sufficient than SASA in 
hot spot prediction. Bogan and Thorn [7] found that hot 
spot residues tend to have low SASA values. Based on 
this observation, they suggested the existence of a ring of 
energetically less important residues that are responsible 
of protecting the hot spot. Generally, a low SASA value is 
a necessary condition for a residue to become a hot spot 
residue. Thus, it is usually used someway for hot spot 
prediction. For example, the HotSprint database [13] 
defines computational hot spots as those conserved 



residues that have large SASA change (ASASA) in 
complex formation and low SASA in the complex. 
However, a low SASA value is not a sufficient condition 
for a hot spot. As observed in Figure 2(a), in our data set, 
hot spot residues tend to have low SASA values with 
more than 80% of hot spot residues having SASA less 
than 30 A^. But non hot spot residues also follow such a 
tendency (55%), albeit in a less remarkable yet observable 
way. We have tried to incorporate SASA into our model, 
by adding it to the feature set. The performance drops a 
lot, only 0.3188 by F measure. 

In contrast, as shown in Figure 2(b), hot spot residues 
tend to have a high burial level, while non hot spot resi- 
dues do not. More than 60% of hot spot residues have a 
burial level no less than 2.0, whereas less than 20% of 
non hot spot residues have such burial levels. Thus, we 
conjecture that a high burial level is not only necessary 
but also more sufficient than a low SASA value for a 
hot spot residue. 
Deeply buried atomic contacts 

Type-I atomic contacts roughly correspond to salt 
bridges. Some researchers believe that buried salt 
bridges provide neutral or even negative contribution to 
protein stability [52,53] because the desolvation of 
charged groups requires more energy than the interac- 
tion energy of the formation of the salt bridge [54]. But 
in protein-protein interaction, it is found that interfacial 
salt bridges are more buried than intra-chain salt 
bridges, and the salt bridges are found favorable across 
the interface [3]. Perhaps the two proteins are folded 
independently with more charged residues exposed and 
their conformation change during complex formation is 
very restricted, thus the two proteins prefer to interact 
in an electrostatic complementary manner. 
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Figure 2 Distribution of the SASA values (a), and distribution of the residue burial levels (b), in the hot spot and non hot spot 
residues. The two figures here show a comparison between the distribution of the values of two residue descriptors: SASA that is usually used 
in hot spot prediction, and burial level that is proposed by us, in hot spot and non hot spot residues. In (a), the distribution of SASA is shown, in 
which we can find that both the number of hot spots and the number of non hot spots go down as SASA gets larger In (b), as the burial level 
gets larger, the number of non hot spot residues drops while the number of hot spot residues goes up. 
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spot residues that have at least one Type-I (a), Type-ll (b) or Type-Ill (c) directly broken interfacial atomic contact at different burial level. It can be 
noted that the values for deeply buried atomic contacts are significantly different in hot spot and non hot spot residues. 



Percentages of hot spot residues and non hot spot 
residues whose C{I, x) are larger than 0 are plotted in 
Figure 3(a). Generally, hot spot residues tend to have 
their salt bridges buried while non hot spot residues do 
not. Two adjacent exposed oppositely charged groups 
may not form stable salt bridges at all [4]; thus some 
exposed Type-I contacts, most of which are possessed 
by non hot spot residues, may not be stable salt bridges. 

Hydrogen bonds play a key role in protein-protein 
interaction [2]. Most interfacial hydrogen bonds are 
extremely buried; and the more buried a hydrogen bond 
donor/acceptor is, the more likely it is to form a hydro- 
gen bond [3]. Thus, being buried is favorable for interfa- 
cial hydrogen bonds. Figure 3(b) shows the percentages 
of hot spot residues and non hot spot residues whose C 
(//, x) are larger than 0. It can be seen that nearly 30% 
of non hot spot residues have exposed Type-II atomic 
contacts, but very few of them have deeply buried 
hydrogen bonds. The case is totally different in hot spot 
residues. There are more hot spot residues that have 
deeply buried Type-II atomic contact while a few of 
them have exposed ones. The number of residues that 
have extremely buried (burial levels 3) atomic contacts 
is limited by the size of the protein complexes. 

Type-Ill contacts contain all other kinds of contacts 
that are neither salt bridges nor hydrogen bonds, includ- 
ing hydrophobic contacts and other polar contacts. 
Actually hydrophobic contacts are not specific contact 
between atoms but are the packing of groups of hydro- 
phobic side chains. The contribution of hydrophobic 
contacts to bonding free energy is correlated with the 
buried surface area [6]. Thus energetically important 
hydrophobic contacts are those buried ones. Generally, 
protein-protein interfaces are dominated by salt bridges, 
hydrogen bonds and hydrophobic contacts; but some- 
times other contacts also make contribution to the bind- 
ing [3]. A hot spot is usually a densely packed region in 
the interface, thus the number of buried contacts of a 
hot spot residue tends to be large, which can be 



reflected by deeply buried Type-Ill contacts. As shown 
in Figure 3(c), more than 80% of hot spot residues have 
Type-Ill contact at burial level 2 and only about 20% 
non hot spot residues have Type-Ill contact at this bur- 
ial level. 

Case study: three residues that are difficult to classify 

Figure 4 shows the structure of two residues that are 
difficult to classify. ARG-17 of BPTI shown in Figure 4 
(a) is well buried in the interface of the complex with a 
very low SASA of 8.0A^, a small SASA value that is not 
even enough to define an exposed atom. Arginines are 
actually very likely to be hot spot residues [7,9], espe- 
cially when they have such a low SASA. However, this 
ARG-17 is a non hot spot residue, having a AAG of 
only 0.5 kcal/mol. Its burial level is 1.55, which is not a 
high value and, more importantly, almost all its atomic 
contacts with bovine chymotrypsin are just slightly bur- 
ied or even exposed. There are 4 Type-II contacts 
shown in the figure, with 2 exposed and 2 slightly bur- 
ied. It also has another 15 Type-Ill contacts, with 13 
slightly buried, 1 exposed and 1 deeply buried. We suc- 
cessfully classified this residue as a non hot spot residue. 

Another example as shown in Figure 4(b) is TYR-29 
of barstar. This residue is exposed with an SASA of 
64.12 A^; however, it is a hot spot residue with a AAG 
of 3.4 kcal/mol. There are only two hot spot residues 
that have SASA larger than 60A in our data set. We 
can still successfully identify it as a hot spot residue by 
using its deeply buried atomic contacts. The side-chain 
of tyrosine which contains an aromatic ring and a 
hydroxyl group is capable of forming aromatic 7r-inter- 
actions and hydrogen bonds [7]. As can be seen from 
the figure, although TYR-29 of barstar is partially 
exposed, its side-chain stretches into the complex and 
forms many deeply buried atomic contacts. For the 4 
Type-II interfacial atomic contacts shown in the figure, 
3 are deeply buried and 1 is slightly buried. There are 
another 8 deeply buried Type-Ill contacts, 7 of which 



Li et al. BMC Systems Biology 201 1, 5(Suppl 1):S5 
http://www.biomedcentral.eom/1752-0509/5/S1/S5 



Page 9 of 1 1 




Figure 4 Two residues that are difficult to classify in our data set. Two residues that are difficult to classify in our data set. In (a), ARG-17 of 
BPTI (in yellow) in bovine cfiymotrypsin (in green)-BPTI complex [PDB:1CBW] is sfiown. Tfiis residue is a non hot spot residue (AAG=0.5 kcal/mol) 
with an extremely low SASA (8.0 A^). In (b), TYR-29 of barstar (in yellow) in barnase (in green)-barstar complex [PDB:1BRS] is shown. This residue 
is a hot spot residue (AAG=3.4 kcal/mol), but it has a large SASA (64.12 A^). These two residues have been correctly classified by our method. 
The dashed lines in the figures are Type-ll atomic contacts (candidates of hydrogen bonds). 



are made by the aromatic ring and 5 are atomic contacts 
with HIS- 102, an active site residue of barnase [55]. 

In Figure 5, we show another residue — ASP-435 of 
ribonuclease inhibitor in ribonuclease inhibitor-angio- 
genin complex — that is in a "wet" local environment, 
although it is buried inside the interface and well 
wrapped. It is a hot spot residue with a AAG of 3.5 





Figure 5 A case that only our method works. ASP-435, a residue 
of ribonuclease inhibitor (in yellow) in ribonuclease inhibitor- 
angiogenin (in green) complex [PDB:1A4Y] that we have successfully 
classified as a hot spot residue while Robetta, FoldX, EGAD and 
MINERVA failed. This residue has several nearby water molecules (in 
blue spheres), which may be the reason why other methods cannot 
predict its type successfully. These water molecules are buried water 
molecules are considered as part of the protein complex by our 
method. 



kcal/mol. As can be seen from the figure, this residue 
has several nearby water molecules, which may be the 
reason Robetta, FoldX, EGAD and MINERVA have 
failed to classify it as a hot spot residue. We consider 
buried water molecules as part of the protein complex; 
thus, the buried water molecules shown in the figure 
not only are buried but also are shelters of the nearby 
residues. This residue has 3 Type-I, 1 Type-II and 11 
Type-Ill deeply buried atomic contacts. If we do not 
consider buried water molecules as part of the com- 
plex, these atomic contacts will no longer be deeply 
buried. 

Conclusions 

We have proposed a feature-based method to predict 
protein-binding hot spots by using deeply buried interfa- 
cial atomic contacts that are directly broken during ala- 
nine substitution. The method is based on a graph 
theoretical definition of burial level of residues, atoms 
and atomic contacts. We achieved an F measure of 
0.6237 when AAG > 2.0 is used as the threshold to 
define hot spot residues. The burial level of a residue is 
more intuitive than the concept of SASA. It is nicely 
correlated with the AAG of a residue. We have shown 
that a high residue burial level is in general necessary 
for a residue to be a hot spot residue. Furthermore, it is 
more sufficient than SASA, a frequently used feature in 
existing hot spot prediction methods. Our results also 
reveal that hot spot residues tend to have deeply buried 
atomic contacts while non hot spots tend to have 
exposed and slightly buried ones. This is consistent with 
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previous studies that emphasize the energetic contribu- 
tion of buried salt bridges, hydrogen bonds and hydro- 
phobic contacts. 
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